## Time-aware models on the Leaderboard {: #time-aware-models-on-the-leaderboard }

Once you click **Start**, DataRobot begins the model-building process and returns results to the Leaderboard.

!!! note
    Model parameter selection has not been customized for date/time-partitioned projects. Though automatic parameter selection yields good results in most cases, [**Advanced Tuning**](adv-tuning) may significantly improve performance for some projects that use the Date/Time partitioning feature.

While most elements of the Leaderboard are the same, DataRobot's calculation and assignment of [recommended models](model-rec-process) differ. Also, the **Sample Size** function is different for date/time-partitioned models. Instead of reporting the percentage of the dataset used to build a particular model, under **Feature List & Sample Size**, the default display lists the sampling method (random/latest) and one of the following:

* The start/end date (either manually added or automatically assigned for the recommended model):

	![](images/otp-start-end-lb.png)

* The duration used to build the model:

	![](images/otp-sample-size.png)


* The number of rows:

	![](images/otp-rows-lb.png)

* The **Project Settings** label, indicating custom backtest configuration:

	![](images/otp-ps-lb.png)

You can filter the Leaderboard display by time window sample percent, sampling method, and feature list using the dropdown available from **Feature List & Sample Size**. For example, use the filter to select the models from a single Autopilot stage.

![](images/otp-lb-filter.png)


Autopilot does not optimize the amount of data used to build models when using Date/Time partitioning. Training windows of different lengths may yield better performance by including more data (for longer model-training periods) or by focusing on recent data (for shorter training periods). You may improve model performance by adding models based on shorter or longer training periods. You can customize the training period with the **Add a Model** option on the Leaderboard.

Another partitioning-dependent difference is the origination of the Validation score. With date partitioning, DataRobot initially builds a model using only the first backtest (the partition displayed just below the holdout test) and reports the score on the Leaderboard. When calculating the holdout score (if enabled) for row count or duration models, DataRobot trains on the first backtest, freezes the parameters, and then trains the holdout model. In this way, both models have the same relationship between training and scoring data: the interval from the end of backtest 1 training to the start of backtest 1 validation is equal in duration to the interval from the end of holdout training data to the start of holdout.

Note, however, that backtesting scores depend on the [sampling method](#set-rows-or-duration) selected. DataRobot only scores all backtests for a limited number of models (you must manually run others). Which backtests run automatically depends on the sampling method:

* With *random*, DataRobot always backtests the best blueprints on the maximum available sample size. For example, if `BP0 on P1Y @ 50%` has the best score, and BP0 has been trained on `P1Y @ 25%`, `P1Y @ 50%`, and `P1Y` (the 100% model), DataRobot will score all backtests for BP0 trained on `P1Y`.

* With *latest*, DataRobot preserves the exact training settings of the best model for backtesting. In the case above, it would score all backtests for `BP0 on P1Y @ 50%`.
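
The two rules above can be sketched in a few lines of Python. This is only an illustration of the selection logic as described here, not DataRobot's implementation: the `TrainedModel` class, the `model_to_backtest` function, and the scores are hypothetical, and the sketch assumes a metric where a lower score is better.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainedModel:
    blueprint: str    # hypothetical blueprint label, e.g. "BP0"
    duration: str     # training duration, e.g. "P1Y"
    sample_pct: int   # time window sample percent; 100 is the full window
    score: float      # validation score; this sketch assumes lower is better


def model_to_backtest(models, sampling_method):
    """Pick the model whose backtests would be scored automatically."""
    best = min(models, key=lambda m: m.score)
    if sampling_method == "latest":
        # Keep the exact training settings of the best model.
        return best
    if sampling_method == "random":
        # Same blueprint and duration, at the largest sample size trained.
        candidates = [
            m for m in models
            if m.blueprint == best.blueprint and m.duration == best.duration
        ]
        return max(candidates, key=lambda m: m.sample_pct)
    raise ValueError(f"unknown sampling method: {sampling_method!r}")


# Made-up scores for the BP0 example in the text.
models = [
    TrainedModel("BP0", "P1Y", 25, 0.31),
    TrainedModel("BP0", "P1Y", 50, 0.27),   # best validation score
    TrainedModel("BP0", "P1Y", 100, 0.29),
]

print(model_to_backtest(models, "random"))  # the P1Y (100%) model
print(model_to_backtest(models, "latest"))  # the P1Y @ 50% model
```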

Note that when the model used to score the validation set was trained on less data than the training size displayed on the Leaderboard, the score displays an asterisk. This happens when training size is equal to full size minus holdout.

As with [cross-validation](data-partitioning), you must initiate a separate build for the other configured backtests (if you initially set the number of backtests to greater than 1). Click a model’s **Run** link from the Leaderboard, or use **Run All Backtests for Selected Models** from the Leaderboard menu. (You can use this option to run backtests for single or multiple models at one time.)

![](images/otp-run.png)

The resulting score displayed in the **All Backtests** column represents an average score for all backtests. See the description of [**Model Info**](model-info) for more information on backtest scoring.

![](images/otp-run-value.png)
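
As a rough illustration of the **All Backtests** value, the sketch below averages a set of per-backtest scores. It assumes a plain arithmetic mean, which this page does not confirm, and the scores and the `all_backtests_score` helper are made up for the example.

```python
from statistics import fmean


def all_backtests_score(backtest_scores):
    """Average the per-backtest scores (assumes a simple arithmetic mean)."""
    if not backtest_scores:
        raise ValueError("no backtests have been scored yet")
    return fmean(backtest_scores)


# Hypothetical scores for three completed backtests.
scores = [0.25, 0.30, 0.35]
print(all_backtests_score(scores))
```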

### Change the training period {: #change-the-training-period }

!!! note
    Consider [retraining your model on the most recent data](otv#retrain-before-deployment) before final deployment.

You can change the training range and sampling rate and then rerun a particular model for date-partitioned builds. Note that you cannot change the duration of the validation partition once models have been built; that setting is only available from the **Advanced options** link before the building has started. Click the plus sign (**+**) to open the **New Training Period** dialog:

![](images/otp-open-training.png)

The **New Training Period** box has multiple selectors, described in the table below:

![](images/otp-new-training.png)

|   | Selection | Description |
|---|---|---|
| ![](images/icon-1.png) | Frozen run toggle  | [Freeze the run](frozen-run) |
| ![](images/icon-2.png)  | Training mode   | Rerun the model using a different training period. Before setting this value, see [the details](ts-customization#duration-and-row-count) of row count vs. duration and how they apply to different folds. |
| ![](images/icon-3.png)  | Snap to  | Snap to predefined points to simplify entering values and avoid manual scrolling or calculation. |
| ![](images/icon-4.png)  | [Enable time window sampling](#time-window-sampling) | Train on a subset of data within a time window for a duration or [start/end](#setting-the-start-and-end-dates) training mode. Check to enable and specify a percentage. |
| ![](images/icon-5.png)  | [Sampling method](#set-rows-or-duration)   | Select the sampling method used to assign rows from the dataset. |
|![](images/icon-6.png)   | Summary graphic | View a summary of the observations and testing partitions used to build the model. |
|  ![](images/icon-7.png) | Final Model  | View an image that changes as you adjust the dates, reflecting the data to be used in the model you will make predictions with (see the [note](#about-final-models) below). |


Once you have set a new value, click **Run with new training period**. DataRobot builds the new model and displays it on the Leaderboard.

#### Setting the duration {: #setting-the-duration }

To change the training period a model uses, select the **Duration** tab in the dialog and set a new length. Duration is measured from the beginning of validation working back in time (to the left). With the Duration option, you can also enable [time window sampling](#time-window-sampling).

DataRobot returns an error for any period of time outside of the observation range. Also, the units available depend on the time format (for example, if the format is `%d-%m-%Y`, you won't have hours, minutes, and seconds).

![](images/otp-duration.png)
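
To make the "working back in time" rule concrete, the following sketch computes a training window from a validation start and a duration, and rejects a window that reaches outside the observation range. The `training_window` function and the dates are hypothetical, and the sketch assumes no gap between the end of training and the start of validation.

```python
from datetime import datetime, timedelta


def training_window(validation_start, duration, first_observation):
    """Return (start, end) of training, measured back from validation."""
    training_start = validation_start - duration
    if training_start < first_observation:
        # Mirrors the error described above for periods outside the data.
        raise ValueError("training period starts before the first observation")
    # The end is the validation start (assumes no gap is configured).
    return training_start, validation_start


start, end = training_window(
    validation_start=datetime(2016, 1, 1),
    duration=timedelta(days=90),
    first_observation=datetime(2015, 1, 1),
)
print(start.date(), "to", end.date())
```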

#### Setting the row count {: #setting-the-row-count }

The row count used to build a model is reported on the Leaderboard as the Sample Size. To vary this size, click the **Row Count** tab in the dialog and enter a new value.

![](images/otp-row-count.png)

#### Setting the start and end dates {: #setting-the-start-and-end-dates }

If you enable [Frozen run](frozen-run) by clicking the toggle, DataRobot re-uses the parameter settings it established in the original model run on the newly specified sample. Enabling Frozen run unlocks a third training criterion, Start/End Date. Use this selection to manually specify which data DataRobot uses to build the model. With this setting, after unlocking holdout, you can train a model into the Holdout data. (The Duration and Row Count selectors do not allow training into holdout.) Note that if holdout is locked and the dates you set overlap the holdout data, model building fails. With the start and end dates option, you can also enable [time window sampling](#time-window-sampling).

![](images/otp-start-end.png)

When setting start and end dates, note the following:

* DataRobot does not run backtests because some of the data may have been used to build the model.
* The end date is excluded when extracting data. In other words, if you want data through December 31, 2015, you must set the end date to January 1, 2016.
* If the validation partition (set via Advanced options before initial model build) occurs after the training data, DataRobot displays a validation score on the Leaderboard. Otherwise, the Leaderboard displays N/A.
* Similarly, if any of the holdout data is used to build the model, the Leaderboard displays N/A for the Holdout score.
* Date/time partitioning does not support dates before 1900.
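
The exclusive end date is a common source of off-by-one errors. The sketch below shows the half-open interval behavior described in the list above; `rows_in_window` and `exclusive_end` are hypothetical helpers for illustration only, not part of any DataRobot API.

```python
from datetime import date, timedelta


def rows_in_window(rows, start, end):
    """Keep rows dated in [start, end): start is included, end is excluded."""
    return [row for row in rows if start <= row[0] < end]


def exclusive_end(last_day_wanted):
    """Convert the last day you want included into the end date to enter."""
    return last_day_wanted + timedelta(days=1)


rows = [
    (date(2015, 12, 30), "a"),
    (date(2015, 12, 31), "b"),
    (date(2016, 1, 1), "c"),
]

end = exclusive_end(date(2015, 12, 31))          # January 1, 2016
selected = rows_in_window(rows, date(2015, 1, 1), end)
print(end, [label for _, label in selected])
```

Setting the end date to December 31, 2015 instead would drop row `"b"` from the selection.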

Click **Start/End Date** to open a clickable calendar for setting the dates. The dates displayed on opening are those used for the existing model. As you adjust the dates, check the **Final model** graphic to view the data your model will use.

![](images/otp-final-model.png)

### Time window sampling {: #time-window-sampling }

If you do not want to use all data within a time window for a date/time-partitioned project, you can train on a subset of data within a time window specification. To do so, check the **Enable time window sampling** box and specify a percentage. DataRobot will take a uniform sample over the time range using that percentage of the data. This feature helps with larger datasets that may need the full time window to capture seasonality effects, but could otherwise face runtime or memory limitations.

 ![](images/otp-time-sample.png)
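
The idea of time window sampling can be approximated as follows. This page does not describe DataRobot's exact sampling procedure, so the sketch assumes a simple random sample without replacement over the rows in the window, with the row count rounded down; `sample_time_window` and the data are hypothetical.

```python
import random
from datetime import date, timedelta


def sample_time_window(rows, start, end, percent, seed=0):
    """Uniformly sample `percent` of the rows dated in [start, end)."""
    in_window = [row for row in rows if start <= row[0] < end]
    k = int(len(in_window) * percent / 100)   # rounded down
    rng = random.Random(seed)                 # seeded for repeatability
    sampled = rng.sample(in_window, k)        # without replacement
    return sorted(sampled, key=lambda row: row[0])


# 200 hypothetical daily rows starting January 1, 2015.
first_day = date(2015, 1, 1)
rows = [(first_day + timedelta(days=i), i) for i in range(200)]

subset = sample_time_window(
    rows, first_day, first_day + timedelta(days=200), percent=25
)
print(len(subset), subset[0][0], subset[-1][0])
```

Because the sample is drawn across the whole window, the subset still spans roughly the full date range, which is what preserves seasonality while reducing the row count.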

## View summary information {: #view-summary-information }

Once models are built, use the [**Model Info**](model-info) tab for the model overview, backtest summary, and resource usage information.

![](images/otp-model-info.png)

Some notes:

* Hover over the folds to display rows, dates, and duration, which may differ from the values shown on the Leaderboard. The values displayed are the actual values DataRobot used to train the model. For example, suppose you request a [Start/End Date](#setting-the-start-and-end-dates) model from 6/1/2015 to 6/30/2015, but your dataset only contains data from 6/7/2015 to 6/14/2015. The hover display then shows the actual start and end dates, 6/7/2015 through 6/15/2015, with a duration of eight days.

* The **Model Overview** is a summary of row counts from the validation fold (the first fold under the holdout fold).

* If you created duration-based testing, the validation summary could result in differences in numbers of rows. This is because the number of rows of data available for a given time period can vary.

* A message of **Not Yet Computed** for a backtest indicates that no data was available for the validation fold (for example, because of gaps in the dataset). In this case, where not all backtests were completed, DataRobot displays an asterisk on the backtest score.

* The “reps” listed at the bottom correspond to the backtests above and are ordered in the sequence in which they finished running.
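
The Start/End Date example in the first note can be reproduced with a short calculation. The sketch assumes the displayed end date is exclusive (one day after the last observation used), consistent with the end-date rule for Start/End Date models; `effective_window` is a hypothetical helper, not a DataRobot function.

```python
from datetime import date, timedelta


def effective_window(requested_start, requested_end, observation_dates):
    """Return (start, exclusive_end, days) actually covered by the data."""
    used = [d for d in observation_dates if requested_start <= d < requested_end]
    if not used:
        raise ValueError("no observations fall inside the requested window")
    start = min(used)
    end = max(used) + timedelta(days=1)   # exclusive end (assumption)
    return start, end, (end - start).days


# Data exists only from 6/7/2015 through 6/14/2015.
observations = [date(2015, 6, 7) + timedelta(days=i) for i in range(8)]

print(effective_window(date(2015, 6, 1), date(2015, 6, 30), observations))
# → (datetime.date(2015, 6, 7), datetime.date(2015, 6, 15), 8)
```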
